Visualizing Effect Sizes for Science Communication

Which Plot Types and Enrichment Options Support Sense-Making?

Jürgen Schneider
Kirstin Schmidt, Kristina Bohrer, Samuel Merk

26 August 2023

Theory

Why effect sizes?

  • “Clearinghouse” approaches
    provide an evidence base of systematic research (Knogler et al., 2022)



  • Effect sizes (ES) as one of the key pieces of information (Burns et al., 2011)



  • Scientists and clearinghouses typically report standardized textual ES metrics (Cohen, 1988)

Theory

Why visualizations?

Theory

Teacher-oriented science communication

information processing
- task difficulty (Marcus et al., 1996; Korbach et al., 2017)
- efficiency
comprehension
- accuracy (Merk et al., in press)
- sensitivity
perceived practical relevance
- perceived informativity (Lortie-Forgues et al., 2021)
- perceived value

(Jensen & Gerber, 2020)

Theory

Research interest & studies

Delphi-Study

Expert judgment on teacher-oriented visualizations


exploratory
Study 1

Comparison of the effect of different visualization types on sense-making

exploratory
Study 2

Comparison of the effect of different enrichment options on sense-making

confirmatory

Delphi study

  • 4 experts in data visualization, 4 experts in science communication in clearinghouses & transfer
  • phase 1: collection of 16 visualization types
    (for group values on a metric variable)
  • phase 2: rating and ranking of 44 plots

“How accurately might teachers assess the ES depicted in the plot above?”
(7-point Likert scale; totally random - totally accurate)



Results: “Top ranked” visualizations

Study 1: Visualization types

Design

Study 1

Comparison of the effect of different visualization types on sense-making

exploratory
  • teachers (N = 40; Bayesian updating)

  • 4 x 6 within-design
    • 4 visualization types
    • 6 ES (d= -.8 to .8)

  • randomizations
    • order of conditions
    • vignettes (1 of 4, randomized between participants)



Open Materials: github.com/j-5chneider/effsize_public

Study 1: Visualization types

Measures

Perceived task difficulty: “How difficult was it for you to understand the figure?” (Marcus et al., 1996)
Efficiency: [time taken to answer the sensitivity and accuracy items] (own creation)
Sensitivity: “Is one group superior to the other or are they approximately the same?” (Merk et al., in press)
Accuracy
  ...abstract metric: “The group that reads on... tablet is entirely superior to the one with paper” to “paper is entirely superior to the one with tablet” (own creation)
  ...Cohen’s U₃: “Look at the mean test score of the group reading on paper: What percentage of the group that reads on tablet has a higher test score than this value?” (Grice & Barrett, 2014)
  ...overlap: “What percentage of the groups will overlap on the test score?” (own creation)
Perceived informativity: “How informative do you perceive the way the information is presented in the figure?” (Lortie-Forgues et al., 2021)
Perceived value: “To what extent are these results relevant for your future teaching?” (own creation)
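Under the usual assumption of two normal groups with equal variance, the two percentage-based accuracy targets (Cohen's U₃ and the overlap) map directly onto Cohen's d. A small sketch of that mapping (the normality and unit-variance assumptions are mine, made only to keep the formulas closed-form):

```python
from statistics import NormalDist

def u3(d):
    """Cohen's U3: share of the comparison group scoring below the other
    group's mean, assuming unit-variance normal groups d apart."""
    return NormalDist().cdf(d)

def overlap(d):
    """Overlapping coefficient (OVL) of two unit-variance normals d apart."""
    return 2 * NormalDist().cdf(-abs(d) / 2)

d = 0.8  # largest ES used in Study 1
print(round(u3(d), 2), round(overlap(d), 2))  # → 0.79 0.69
```

So even at the largest effect size shown to participants (d = 0.8), roughly two thirds of the two distributions overlap, which is what the overlap item asks teachers to read off the plot.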

Demo: es-vis-demo1.formr.org

Study 1: Visualization types

Results

Bayesian multilevel analysis: dummy-coded visualization types


Visualization types have an influence on…

  • Task difficulty: \((BF_{10} > 100)\)
  • Efficiency: \((BF_{10} > 100)\)
  • Accuracy:
    • Abstract metric: \((BF_{10} < 1/100)\)
    • Overlap: \((BF_{10} > 100)\)
    • Cohen’s U₃: \((BF_{10} < 1/100)\)
  • Sensitivity: \((BF_{10} = 16.62)\)
  • Informativeness: \((BF_{10} > 100)\)
  • Value: \((BF_{10} > 100)\)

Open Data & Open Code:
github.com/j-5chneider/effsize_public
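For readers less used to Bayes factors, the BF₁₀ values above can be read through the conventional Jeffreys-style verbal categories (as popularized by Lee & Wagenmakers). The helper below is a sketch of those conventional thresholds, not part of the original analysis:

```python
def bf_label(bf10):
    """Conventional Jeffreys-style verbal label for a Bayes factor BF10."""
    if bf10 < 1:
        # evidence favors H0; label the reciprocal and flip the hypothesis
        return bf_label(1 / bf10).replace("for H1", "for H0")
    for bound, label in [(3, "anecdotal evidence for H1"),
                         (10, "moderate evidence for H1"),
                         (30, "strong evidence for H1"),
                         (100, "very strong evidence for H1")]:
        if bf10 < bound:
            return label
    return "extreme evidence for H1"

print(bf_label(16.62))   # sensitivity result → "strong evidence for H1"
print(bf_label(0.005))   # BF10 < 1/100 → "extreme evidence for H0"
```

On this reading, most outcomes show extreme evidence that visualization type matters, while the two BF₁₀ < 1/100 results are extreme evidence that it does not matter for those accuracy metrics.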

Study 1: Visualization types

Results




Type                     Task difficulty   Efficiency   Accuracy (overlap)   Sensitivity   Informativity   Value
Gardner-Altman (x-axis)  3.587             23379.21     0.024                0.658         3.800           3.800
Halfeye (x-axis)         4.233             17178.32     0.035                0.717         4.188           4.312
Halfeye (y-axis)         4.554             16889.87     0.001                0.723         4.404           4.383
Raincloud (y-axis)       3.829             20294.13     0.019                0.631         4.042           4.029

Study 1: Visualization types

Results

Discussion

  • Visualization type relevant predictor of successful communication of effect sizes


  • Halfeye plot as promising visualization type


  • Evidence that misconceptions pose a risk


  • Not stand-alone: to be embedded in the context of clearinghouses and teacher education/training


  • Limitation: Not (yet) compared to textual representations

Thank you!



Jürgen Schneider
ju.schneider@dipf.de


Cooperation


Albers, D., Correll, M., & Gleicher, M. (2014). Task-driven evaluation of aggregation in time series visualization. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 551–560. https://doi.org/10.1145/2556288.2557200
Baddeley, A. (1992). Working memory. Science, 255(5044), 556–559.
Baird, M. D., & Pane, J. F. (2019). Translating Standardized Effects of Education Programs Into More Interpretable Metrics. Educational Researcher, 48(4), 217–228. https://doi.org/10.3102/0013189X19848729
Brown, C., Schildkamp, K., & Hubers, M. D. (2017). Combining the best of two worlds: A conceptual proposal for evidence-informed school improvement. Educational Research, 59(2), 154–172. https://doi.org/10.1080/00131881.2017.1304327
Burns, P. B., Rohrich, R. J., & Chung, K. C. (2011). The levels of evidence and their role in evidence-based medicine. Plastic and Reconstructive Surgery, 128(1), 305–310. https://doi.org/10.1097/PRS.0b013e318219c171
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Taylor and Francis.
Farley-Ripple, E. N., Oliver, K., & Boaz, A. (2020). Mapping the community: Use of research evidence in policy and practice. Humanities and Social Sciences Communications, 7(1), 83. https://doi.org/10.1057/s41599-020-00571-2
Franconeri, S. L., Padilla, L. M., Shah, P., Zacks, J. M., & Hullman, J. (2021). The Science of Visual Data Communication: What Works. Psychological Science in the Public Interest, 22(3), 110–161. https://doi.org/10.1177/15291006211051956
Hanel, P. H. P., Maio, G. R., & Manstead, A. S. R. (2019). A new way to look at the data: Similarities between groups of people are large and important. Journal of Personality and Social Psychology, 116(4), 541–562. https://doi.org/10.1037/pspi0000154
Hanel, P. H. P., & Mehler, D. M. (2019). Beyond reporting statistical significance: Identifying informative effect sizes to improve scientific communication. Public Understanding of Science, 28(4), 468–485. https://doi.org/10.1177/0963662519834193
Hedges, L. V. (2018). Challenges in Building Usable Knowledge in Education. Journal of Research on Educational Effectiveness, 11(1), 1–21. https://doi.org/10.1080/19345747.2017.1375583
Jacowitz, K. E., & Kahneman, D. (1995). Measures of Anchoring in Estimation Tasks. Personality and Social Psychology Bulletin, 21(11), 1161–1166. https://doi.org/10.1177/01461672952111004
Jensen, E. A., & Gerber, A. (2020). Evidence-Based Science Communication. Frontiers in Communication, 4, 78. https://doi.org/10.3389/fcomm.2019.00078
Kale, A., Kay, M., & Hullman, J. (2020). Visual Reasoning Strategies for Effect Size Judgments and Decisions. https://doi.org/10.48550/ARXIV.2007.14516
Kim, Y.-S., Hofman, J. M., & Goldstein, D. G. (2022). Putting scientific results in perspective: Improving the communication of standardized effect sizes. CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3491102.3502053
Knogler, M., Hetmanek, A., & Seidel, T. (2022). Determining an Evidence Base for Particular Fields of Educational Practice: A Systematic Review of Meta-Analyses on Effective Mathematics and Science Teaching. Frontiers in Psychology, 13, 873995. https://doi.org/10.3389/fpsyg.2022.873995
Korbach, A., Brünken, R., & Park, B. (2017). Measurement of cognitive load in multimedia learning: A comparison of different objective measures. Instructional Science, 45(4), 515–536. https://doi.org/10.1007/s11251-017-9413-5
Lipsey, M. W., Puzio, K., Yun, C., Herbert, M. A., Steinka-Fry, K., Cole, M. W., Roberts, M., Anthony, K. S., & Busick, M. D. (2012). Translating the Statistical Representation of the Effects of Education Interventions Into More Readily Interpretable Forms (Institute of Education Sciences, Ed.).
Lortie-Forgues, H., Sio, U. N., & Inglis, M. (2021). How Should Educational Effects Be Communicated to Teachers? Educational Researcher. Advance online publication. https://doi.org/10.3102/0013189X20987856
Marcus, N., Cooper, M., & Sweller, J. (1996). Understanding instructions. Journal of Educational Psychology, 88(1), 49–63. https://doi.org/10.1037/0022-0663.88.1.49
McPhetres, J., & Pennycook, G. (2020). Lay people are unimpressed by the effect sizes typically reported in psychological science [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/qu9hn
Pierce, R., & Chick, H. (2013). Workplace statistical literacy for teachers: Interpreting box plots. Mathematics Education Research Journal, 25(2), 189–205. https://doi.org/10.1007/s13394-012-0046-3
Schneider, S., Beege, M., Nebel, S., & Rey, G. D. (2018). A meta-analysis of how signaling affects learning with media. Educational Research Review, 23, 1–24. https://doi.org/10.1016/j.edurev.2017.11.001
Slavin, R. E. (2020). How evidence-based reform will transform research and practice in education. Educational Psychologist, 55(1), 21–31. https://doi.org/10.1080/00461520.2019.1611432
Thomm, E., Gold, B., Betsch, T., & Bauer, J. (2021). When preservice teachers’ prior beliefs contradict evidence from educational research. British Journal of Educational Psychology. https://doi.org/10.1111/bjep.12407

 

Icons:

Icons by Font Awesome CC BY 4.0

Theory

Why science communication?

Theory

What we know about visualizing data (in general)

Theory

State of research on visualization of statistical information (for laypeople)

Accuracy of estimation of statistical information: visualization type plays a role

Support in the process: enrichment options

Study 1 | plot types: Results, descriptives [figure]

Study 1 | plot types: Results, efficiency [figure]

Study 1 | plot types: Results, accuracy [figure]

Study 1 | plot types: Results, accuracy, without misconceptions [figure]

Study 2: Enrichment options

Design

Study 2

Comparison of the effect of different enrichment options on sense-making

confirmatory
  • Teachers (N determined via Bayesian updating)

  • 2 RCTs
    • Factor: visual benchmarking (yes vs. no)
    • Factor: signaling (difference, overlap, no signaling)


Study 2: Enrichment options

Design

  • increases accuracy (Kim et al., 2022; Schmidt et al., 2023)
  • increases task difficulty (Baddeley, 1992)
  • decreases efficiency
  • increases informativeness
  • increases value

Study 2: Enrichment options

Design


  • increases accuracy (if no misconception)
  • reduces number of misconceptions
  • increases sensitivity
  • increases task difficulty
  • increases efficiency
  • increases informativeness
  • increases value